1: Some of you are confused about 'pseudo-replication' and 'sampling error' (better called 'sampling variation', IMO). N.b. 'measurement error' can mean something similar, depending on context.

‘Pseudo-replication’ refers to the case where your sampling design contains observations that are not independent. For example, the high frequency of yellow shells at two locations may share a common cause (perhaps there is high gene flow between them, so the yellow frequency drifts up and down in concert). As an aside, fancy statistical methods can allow you to deal with such effects in some cases, but this type of design is best avoided if possible, so it is usually best to attempt a design where observations are independent.

‘Sampling variation’ is the unavoidable variability in your sample. It is not an ‘error’, in the sense of something done incorrectly. For example, the frequency of yellow in your sample will not be exactly the frequency in the wild population, because you did not sample them all: many of the snails are buried or hidden in thick vegetation. However, your statistical test indicates whether differences in the observed frequency at different locations might be due to sampling variation alone.
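
To see how large this variation can be, here is a rough simulation sketch. The true frequency (0.4), the sample size (50) and the random seed are all made up for illustration; they are not taken from our data. It draws ten samples from the same population and reports the yellow frequency in each.

```python
import random


def sample_yellow_frequency(true_freq, n, rng):
    """Sample n snails from a population and return the observed yellow frequency."""
    yellow = sum(1 for _ in range(n) if rng.random() < true_freq)
    return yellow / n


rng = random.Random(1)   # fixed seed so the run is repeatable
true_freq = 0.4          # hypothetical true frequency of yellow
n = 50                   # hypothetical number of snails per sample

# Ten samples from the SAME population: any differences are sampling variation only.
observed = [sample_yellow_frequency(true_freq, n, rng) for _ in range(10)]
print(observed)
```

The ten values will scatter around 0.4 even though nothing about the population has changed between samples.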

2: Many of you said that since you took samples from over 20m apart they were independent, as snails don’t travel that far. This is an assumption, but many of you stated this as a fact. I corrected this error on many of the online-introductions – so you should have made this important distinction, which goes to the heart of the scientific method. It is an assumption, based on what others have written about dispersal distances. However, dispersal over several generations can lead to correlations between quite distant sites. The assumption may be right, but you cannot be sure, so your write-up should make it clear that your assumption could be wrong.

3: This is a similar point to 2: you need to be more cautious in presenting your experimental design. Many of you tried to control for environmental variation, e.g. by keeping height the same and only varying the habitat type (or keeping the habitat type constant and only varying the height, etc). Again, many of you made the error of stating that this worked as you assumed. You may indeed have successfully controlled for key environmental variables by keeping height constant, but we don’t know for sure that this was successful, as no one knows enough about snail ecology to be sure what the key environmental variables are. You should have said you attempted to control for other environmental variables, not that you did.

4: The number dead and alive depends on many factors – the actual size of the live population, the proportion you saw, the number that die each generation, the number of generations the dead shells last, the proportion of dead shells of each age that you will find. Hence the relative number of live and dead is difficult to interpret, and certainly does not relate directly to death rate.

5: Even if your null hypothesis is false, there may be two or more alternative hypotheses. This is the case for our study: if sampling variation does not explain the difference between sites, the difference may be due to selection, drift, gene flow, or some combination of these. For this reason, even if the data are unlikely under the null hypothesis, a P-value does not tell us anything about the relative probability of the alternative hypotheses (or even the null, actually… the data could be unlikely under the null, but even more unlikely under the alternative hypotheses). So we avoid saying anything other than something like ‘we reject the null hypothesis at the 5% level’.

6: The reason that Cepaea are a good model system is not really their ease of capture, unless you mean our ability to rapidly collect a large sample. The direct link from phenotype to known genotype is helpful (‘they wear their genes on their shells’, Jones pers. comm.). Also, limited dispersal means that genetic differences accumulate on a small spatial scale: you can study differences within a single field, rather than having to look at whole continents.

7: The null hypothesis is that there is no difference in colour frequencies between certain locations – not that we will observe no differences (or that we observe no significant difference). After all we expect to observe differences even if the frequency is actually the same, due to the sampling variation (which is NOT an error in the sense of ‘mistake’).

8: Several groups chose colour and banding categories which meant that the expected frequencies in some cells of their table were less than 3. This makes the chi-squared test unreliable. They should have combined categories or split the sample up differently.
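
A minimal sketch of how you could check this before running the test. The counts are invented for illustration (rows are two sites, columns are three shell categories), and the function names `expected_counts` and `small_cells` are my own, not from any package. The threshold of 3 is the one used above.

```python
def expected_counts(table):
    """Expected counts under the null: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]


def small_cells(table, threshold=3):
    """Return (row, column, expected) for every cell whose expected count is below threshold."""
    exp = expected_counts(table)
    return [(i, j, e)
            for i, row in enumerate(exp)
            for j, e in enumerate(row)
            if e < threshold]


# Hypothetical counts: the third category is rare at both sites.
table = [[20, 15, 2],
         [18, 12, 1]]
print(small_cells(table))

# Combining the second and third categories removes the problem.
merged = [[row[0], row[1] + row[2]] for row in table]
print(small_cells(merged))
```

With these made-up counts, both cells in the third column have expected values below 3 (roughly 1.6 and 1.4), and none do once the rare category is combined with its neighbour.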

9: This is very, very important – and I emphasised it over and over again in the lectures. The Chi-square test CANNOT tell you if genetic drift or selection explains your results. You would expect a significant difference in either case (as long as your sampling strategy was well designed and you were not unlucky in your sampling).

10: Some of you seem to have a quasi-mystical belief in the value of quadrats. Quadrats, in themselves, do not make sampling more ‘scientific’. They are helpful if, for some reason, you need to take multiple samples from areas of the same shape and dimensions. That was not the case for our samples, where we simply needed to evaluate the proportion of different coloured snails in an area.

11: Absence of evidence is not evidence of absence. Hence if you see no evidence of a difference between two localities, there may be none – but you may have been unlucky, or not had a large enough sample size to see subtle differences.
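
A rough simulation sketch of the sample-size point. All numbers are assumptions for illustration: two sites whose true yellow frequencies really do differ (0.4 versus 0.5), compared using a 2x2 chi-squared statistic without continuity correction against the 5% critical value for 1 degree of freedom (3.841). It estimates how often the real difference is detected with 30 snails per site versus 500 per site.

```python
import random

CRITICAL_5PCT_DF1 = 3.841  # chi-squared critical value, 1 d.f., 5% level


def chi2_2x2(a, b, c, d):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        return 0.0  # a margin is empty, so there is no evidence of a difference
    return n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])


def detection_rate(freq1, freq2, n, trials, rng):
    """Fraction of simulated surveys in which the difference is significant at 5%."""
    hits = 0
    for _ in range(trials):
        y1 = sum(rng.random() < freq1 for _ in range(n))  # yellow count at site 1
        y2 = sum(rng.random() < freq2 for _ in range(n))  # yellow count at site 2
        if chi2_2x2(y1, n - y1, y2, n - y2) > CRITICAL_5PCT_DF1:
            hits += 1
    return hits / trials


rng = random.Random(2)  # fixed seed so the run is repeatable
small = detection_rate(0.4, 0.5, n=30, trials=500, rng=rng)
large = detection_rate(0.4, 0.5, n=500, trials=500, rng=rng)
print(small, large)
```

With the small sample, most simulated surveys fail to detect a difference that is really there, which is exactly why a non-significant result is not evidence that the sites are the same.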

12: If you do many Chi-square tests, some of them are expected to be significant even if the null is true. If you do 20, you expect, on average, 1 to appear significant at the 5% level under the null hypothesis (this is what the 5% level means).
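
The arithmetic, as a short sketch. The second quantity assumes the 20 tests are independent of one another, which may not hold if they share data.

```python
alpha = 0.05   # significance level
n_tests = 20   # number of tests carried out

# Average number of tests expected to look significant when the null is true for all of them.
expected_false_positives = n_tests * alpha

# Chance that at least one test looks significant, ASSUMING the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(expected_false_positives, round(p_at_least_one, 2))
```

So under these assumptions there is roughly a 64% chance of at least one ‘significant’ result among 20 tests, even when nothing is going on.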

13: Selection for camouflage is only one possible type of selection on the visible phenotype.

14: If you merge data from several sites, you should first check there are not significant differences between them.
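
One way to make this check, sketched with invented counts (three sites, yellow versus not yellow). The function name `chi2_statistic` is my own, and the small lookup table holds the standard 5% critical values for 1 to 4 degrees of freedom. Bear in mind that a non-significant result here does not prove the sites are the same; it only means you found no evidence against merging.

```python
# 5% critical values of the chi-squared distribution, keyed by degrees of freedom
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi2_statistic(table):
    """Return (chi-squared statistic, degrees of freedom) for a sites x categories table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts per site: [yellow, not yellow]
sites = [[30, 20],
         [28, 22],
         [10, 40]]

stat, df = chi2_statistic(sites)
heterogeneous = stat > CRITICAL_5PCT[df]
print(round(stat, 1), df, heterogeneous)
```

With these made-up counts the statistic is about 19.6 on 2 degrees of freedom, well above 5.991, so the third site differs from the other two and the three should not simply be pooled.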

15: The absence of selection does not lead to drift. Drift affects all alleles in all populations, even if they are subjected to selection.

16: If you do not reject the null hypothesis, it does not follow that you reject the alternative hypothesis. You have not evaluated the relative merits of the null and the alternative.